Quickstart: Create a custom voice assistant


Prerequisites

Before you begin, make sure to:

- Create a Speech resource
- Set up your development environment and create an empty project
- Create a bot connected to the Direct Line Speech channel
- Make sure you have access to a microphone for audio capture

Note

See the list of regions supported for voice assistants, and make sure your resources are deployed in one of those regions.

Create and configure the project

Create an Eclipse project and install the Speech SDK.

Additionally, to enable logging, update the pom.xml file to include the following dependency:

<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.5</version>
</dependency>

Add sample code

To add a new empty class to your Java project, select File > New > Class.

In the New Java Class window, enter speechsdk.quickstart in the Package field and Main in the Name field.

Open the newly created Main class and replace the contents of the Main.java file with the following starting code:

package speechsdk.quickstart;

import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
import com.microsoft.cognitiveservices.speech.audio.PullAudioOutputStream;
import com.microsoft.cognitiveservices.speech.dialog.BotFrameworkConfig;
import com.microsoft.cognitiveservices.speech.dialog.DialogServiceConnector;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.SourceDataLine;
import java.io.InputStream;

public class Main {
    // Declared static so the snippets added to the static main method below can use it directly.
    private static final Logger log = LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) {
        // New code will go here
    }

    // Plays a bot's audio response by wrapping the PullAudioOutputStream in an
    // ActivityAudioStream and writing it to the default audio output device.
    private static void playAudioStream(PullAudioOutputStream audio) {
        ActivityAudioStream stream = new ActivityAudioStream(audio);
        final ActivityAudioStream.ActivityAudioFormat audioFormat = stream.getActivityAudioFormat();
        final AudioFormat format = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                audioFormat.getSamplesPerSecond(),
                audioFormat.getBitsPerSample(),
                audioFormat.getChannels(),
                audioFormat.getFrameSize(),
                audioFormat.getSamplesPerSecond(),
                false);
        try {
            int bufferSize = format.getFrameSize();
            final byte[] data = new byte[bufferSize];

            SourceDataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
            SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
            line.open(format);

            if (line != null) {
                line.start();
                int nBytesRead = 0;
                while (nBytesRead != -1) {
                    nBytesRead = stream.read(data);
                    if (nBytesRead != -1) {
                        line.write(data, 0, nBytesRead);
                    }
                }
                line.drain();
                line.stop();
                line.close();
            }
            stream.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

In the main method, first configure a DialogServiceConfig and use it to create a DialogServiceConnector instance. This instance connects to the Direct Line Speech channel to interact with your bot. An AudioConfig instance is also used to specify the source of audio input. In this example, the default microphone is used with AudioConfig.fromDefaultMicrophoneInput().

Replace the string YourSubscriptionKey with your Speech resource key, which you can obtain from the Azure portal. Replace the string YourServiceRegion with the region associated with your Speech resource.

Note

See the list of regions supported for voice assistants, and make sure your resources are deployed in one of those regions.

final String subscriptionKey = "YourSubscriptionKey"; // Your subscription key
final String region = "YourServiceRegion"; // Your speech subscription service region

final BotFrameworkConfig botConfig = BotFrameworkConfig.fromSubscription(subscriptionKey, region);

// Configure audio input from a microphone.
final AudioConfig audioConfig = AudioConfig.fromDefaultMicrophoneInput();

// Create a DialogServiceConnector instance.
final DialogServiceConnector connector = new DialogServiceConnector(botConfig, audioConfig);
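If you don't have a microphone available for testing, AudioConfig can also read from a WAV file. The following is a minimal sketch and not part of the original quickstart; the file path is a placeholder, and the recording should typically be 16 kHz, 16-bit, mono PCM:

// Illustrative alternative: read audio from a WAV file instead of the default microphone.
// Replace the placeholder path with a real file; use this line in place of fromDefaultMicrophoneInput().
final AudioConfig audioConfig = AudioConfig.fromWavFileInput("path/to/your/recording.wav");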

The DialogServiceConnector relies on several events to communicate its bot activities, speech recognition results, and other information. Add these event listeners next.

// Recognizing will provide the intermediate recognized text while an audio stream is being processed.
connector.recognizing.addEventListener((o, speechRecognitionResultEventArgs) -> {
    log.info("Recognizing speech event text: {}", speechRecognitionResultEventArgs.getResult().getText());
});

// Recognized will provide the final recognized text once audio capture is completed.
connector.recognized.addEventListener((o, speechRecognitionResultEventArgs) -> {
    log.info("Recognized speech event reason text: {}", speechRecognitionResultEventArgs.getResult().getText());
});

// SessionStarted will notify when audio begins flowing to the service for a turn.
connector.sessionStarted.addEventListener((o, sessionEventArgs) -> {
    log.info("Session Started event id: {} ", sessionEventArgs.getSessionId());
});

// SessionStopped will notify when a turn is complete and it's safe to begin listening again.
connector.sessionStopped.addEventListener((o, sessionEventArgs) -> {
    log.info("Session stopped event id: {}", sessionEventArgs.getSessionId());
});

// Canceled will be signaled when a turn is aborted or experiences an error condition.
connector.canceled.addEventListener((o, canceledEventArgs) -> {
    log.info("Canceled event details: {}", canceledEventArgs.getErrorDetails());
    connector.disconnectAsync();
});

// ActivityReceived is the main way your bot will communicate with the client and uses Bot Framework activities.
connector.activityReceived.addEventListener((o, activityEventArgs) -> {
    final String act = activityEventArgs.getActivity().serialize();
    log.info("Received activity {} audio", activityEventArgs.hasAudio() ? "with" : "without");
    if (activityEventArgs.hasAudio()) {
        playAudioStream(activityEventArgs.getAudio());
    }
});
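The serialized activity returned by getActivity().serialize() is a Bot Framework activity in JSON form. If you want to inspect it rather than only log it, the sketch below shows one way to do that inside the activityReceived listener. It is not part of the quickstart and assumes you add a JSON library such as Gson to the project; the type and text fields used here are standard Bot Framework activity fields.

// Illustrative only, assuming Gson (com.google.gson) has been added as a dependency.
// Logs the bot's message text when the received activity is a "message" activity.
com.google.gson.JsonObject parsedActivity =
        com.google.gson.JsonParser.parseString(act).getAsJsonObject();
if (parsedActivity.has("type")
        && "message".equals(parsedActivity.get("type").getAsString())
        && parsedActivity.has("text")) {
    log.info("Bot message text: {}", parsedActivity.get("text").getAsString());
}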

Call the connectAsync() method to connect the DialogServiceConnector to Direct Line Speech. To test your bot, call the listenOnceAsync method to send audio input from your microphone. Additionally, you can use the sendActivityAsync method to send custom activities as serialized strings. These custom activities can provide extra data that your bot uses in the conversation.

connector.connectAsync();

// Start listening.
System.out.println("Say something ...");
connector.listenOnceAsync();

// connector.sendActivityAsync(...)
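The sendActivityAsync call is left commented out above. As a minimal sketch of what such a call can look like (the activity JSON shown here is a hypothetical example; the exact shape depends on what your bot expects), you could send an event activity as a serialized string:

// Hypothetical example: send a custom Bot Framework activity as a serialized JSON string.
// The "event" activity and its name/value fields are illustrative placeholders.
final String customActivity =
        "{\"type\": \"event\", \"name\": \"MyCustomEvent\", \"value\": \"example payload\"}";
connector.sendActivityAsync(customActivity);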

Save your changes to the Main file.

To support playback of responses, add an additional class that converts the PullAudioOutputStream object returned from the getAudio() API into a Java InputStream for easier handling. This ActivityAudioStream is a specialized class that handles audio responses from the Direct Line Speech channel. It provides accessors to fetch the audio format information required for playback. To add it, select File > New > Class.

In the New Java Class window, enter speechsdk.quickstart in the Package field and ActivityAudioStream in the Name field.

Open the newly created ActivityAudioStream class and replace its contents with the following code:

package speechsdk.quickstart; // Same package as Main so the class can be used there without an import.

import com.microsoft.cognitiveservices.speech.audio.PullAudioOutputStream;

import java.io.InputStream;

public final class ActivityAudioStream extends InputStream {
    /**
     * The number of samples played per second (16 kHz).
     */
    public static final long SAMPLE_RATE = 16000;
    /**
     * The number of bits in each sample of a sound that has this format (16 bits).
     */
    public static final int BITS_PER_SECOND = 16;
    /**
     * The number of audio channels in this format (1 for mono).
     */
    public static final int CHANNELS = 1;
    /**
     * The number of bytes in each frame of a sound that has this format (2).
     */
    public static final int FRAME_SIZE = 2;

    /**
     * Reads up to a specified maximum number of bytes of data from the audio
     * stream, putting them into the given byte array.
     *
     * @param b   the buffer into which the data is read
     * @param off the offset, from the beginning of array b, at which
     *            the data will be written
     * @param len the maximum number of bytes to read
     * @return the total number of bytes read into the buffer, or -1 if there
     * is no more data because the end of the stream has been reached
     */
    @Override
    public int read(byte[] b, int off, int len) {
        byte[] tempBuffer = new byte[len];
        int n = (int) this.pullStreamImpl.read(tempBuffer);
        for (int i = 0; i < n; i++) {
            if (off + i > b.length) {
                throw new ArrayIndexOutOfBoundsException(b.length);
            }
            b[off + i] = tempBuffer[i];
        }
        if (n == 0) {
            return -1;
        }
        return n;
    }

    /**
     * Reads the next byte of data from the activity audio stream if available.
     *
     * @return the next byte of data, or -1 if the end of the stream is reached
     * @see #read(byte[], int, int)
     * @see #read(byte[])
     * @see #available
     */
    @Override
    public int read() {
        byte[] data = new byte[1];
        int temp = read(data);
        if (temp <= 0) {
            return -1;
        }
        return data[0] & 0xFF;
    }

    // NOTE: the remainder of this class was truncated in the copy this article is based on.
    // The members below are a minimal reconstruction -- just enough for the playback code in
    // Main.java (a constructor, close(), and the audio-format accessor and holder class).
    // See the Java samples on GitHub linked under Next steps for the full version.

    private final PullAudioOutputStream pullStreamImpl;

    private final ActivityAudioFormat activityAudioFormat;

    public ActivityAudioStream(final PullAudioOutputStream stream) {
        this.pullStreamImpl = stream;
        this.activityAudioFormat = new ActivityAudioFormat(SAMPLE_RATE, BITS_PER_SECOND, CHANNELS, FRAME_SIZE);
    }

    /**
     * Closes the underlying pull audio output stream.
     */
    @Override
    public void close() {
        this.pullStreamImpl.close();
    }

    /**
     * Fetches the audio format information (sample rate, bits per sample, channels,
     * frame size) needed to configure playback of this stream.
     */
    public ActivityAudioFormat getActivityAudioFormat() {
        return activityAudioFormat;
    }

    /**
     * Holder for the fixed PCM format used by activity audio streams.
     */
    public static final class ActivityAudioFormat {
        private final long samplesPerSecond;
        private final int bitsPerSample;
        private final int channels;
        private final int frameSize;

        ActivityAudioFormat(long samplesPerSecond, int bitsPerSample, int channels, int frameSize) {
            this.samplesPerSecond = samplesPerSecond;
            this.bitsPerSample = bitsPerSample;
            this.channels = channels;
            this.frameSize = frameSize;
        }

        public long getSamplesPerSecond() { return samplesPerSecond; }
        public int getBitsPerSample() { return bitsPerSample; }
        public int getChannels() { return channels; }
        public int getFrameSize() { return frameSize; }
    }
}

Save your changes to the ActivityAudioStream file.

Build and run the app

To build the application, select Build > Build Project.

To start the application, press Shift+F10, or select Run > Run 'app'.

In the deployment target window that appears, select your Android device.

After the application and its activity launch, select the button to start talking to your bot. Transcribed text appears as you speak, along with the most recent activity received from the bot. If your bot is configured to respond with speech, the audio plays back automatically.

Next steps

Browse the Java samples on GitHub


